ABSTRACT


The general linear model of statistical inference is
formulated in terms of the Moore-Penrose generalized inverse.
The matrix algebra of the generalized inverse which is
essential to the model is presented. A methodology for
estimation and hypothesis testing is derived which permits
identical manipulation of both the full rank and the
less-than-full rank cases of the model.

I.   INTRODUCTION

The  linear  statistical  model  is  a  mathematical  formula- 
tion which  is  useful  in  the  interpretation  of  data  and  ob- 
servations from  experiments.   While  the  assumptions  necessary 
for  the  use  of  a  linear  statistical  model  restrict  its  appli- 
cation, it  is  frequently  convenient  and  appropriate  to  use 
such  a  model  because  of  its  simplicity  and  its  suitability 
to  many  processes  and  experiments  under  investigation. 

This paper will be concerned with one of the most frequently
utilized linear statistical models. It is assumed that n
observations, $Y_i$, are made of a process or experimental
quantity. The process or experiment has p elements, $X_j$,
each of which has a fixed value for each of n replications.
Associated with the vector of observations is a vector of
errors, denoted by e. The errors are assumed to be
uncorrelated between replications and have a multivariate
distribution with mean vector 0 and variance-covariance
matrix $\sigma^2 I$.

In matrix notation, the model is expressed as

$$Y = X\beta + e,$$

where Y is an n x 1 vector of observations with mean vector
$X\beta$ and variance-covariance matrix $\sigma^2 I$, X is an
n x p matrix of known constants, and $\beta$ is a p x 1 vector
of unknown parameters. $\sigma^2$ is the unknown variance of
the individual observations.

By the use of the general linear statistical model, one can
estimate the assumed functional relationship between Y and X.
If the matrix X is of full rank, we accomplish this by
estimation of $\sigma^2$ and $\beta$. In the less-than-full
rank case, we concern ourselves with estimation of $\sigma^2$
and $F'\beta$, sets of linearly independent estimable
functions. Several tests of hypotheses are frequently useful
in the evaluation of the functional relationships in the
model. Such tests provide conclusions concerning the values of
the unknown parameters which would be useful in predicting
values of Y or in explaining the variability of Y.

This  paper  discusses  the  mathematical  manipulation  of 
a  linear  model  by  which  conclusions  regarding  the  process 
under  investigation  might  be  reached.   The  general  linear 
statistical  model  has  traditionally  been  treated  as  two  sep- 
arate cases,  with  two  different  methodologies.   These  sep- 
arate methodologies  are  necessitated  by  two  general  forms 
which  the  matrix  X  may  assume.   These  forms  are  the  full  rank 
form,  where  the  n  x  p  matrix  X  is  of  rank  p,  and  the  less- 
than-full  rank  form  where  X  is  of  rank  r  <  p.   The  applica- 
tion of  the  general  linear  model  to  experimental  design  fre- 
quently gives  us  the  less-than-full  rank  case,  whereas  in  the 
regression  analysis  application,  the  full  rank  case  is  most 
often  encountered. 

As an example of the apparent necessity for different
methodologies, consider estimation of the vector of unknown
parameters. The method of least squares yields the set of
equations,

$$X'X\hat{\beta} = X'Y,$$

where $\hat{\beta}$ is the least squares estimator of the
vector $\beta$. If X is of full rank, a unique solution for
$\hat{\beta}$ exists, because X'X has an inverse. If, however,
the rank of X is r < p, a unique solution does not exist, and
reparametrization is commonly used to estimate the invariant
features of the solution vectors, $\hat{\beta}$.
Reparametrization involves the linear transformation of the
vector $\beta$ into the vector $\alpha$ and the consequent
change of the model from

$$Y = X\beta + e$$

to

$$Y = Z\alpha + e,$$

where Z is a matrix of full rank. The mathematical
manipulation of this model can then be carried out in much the
same fashion as the manipulation of the full rank case. The
lack of a unified methodology led the author to consider an
approach by which both the full rank and the less-than-full
rank cases could be manipulated by the same mathematical
technique.

In  the  chapters  to  follow,  manipulations  of  the  general 
linear  statistical  model  will  be  formulated  in  terms  of  the 
Moore-Penrose  generalized  inverse.   The  definition  of  the 
generalized  inverse  will  be  taken  as  the  point  of  departure 
for  Chapter  II.   A  survey  of  the  properties  of  the  generalized 
inverse  which  are  preliminary  to  the  statistical  formulation 
will  be  presented. 

In Chapter III the estimation of parameters and functions of
parameters, and the distribution of the estimates will be
discussed. The formulation and testing of hypotheses will be
examined in Chapter IV. For a part of the third chapter and
the entire fourth chapter it will be necessary to specify a
multivariate distribution for the vector of errors. We shall
assume that e has the multivariate normal distribution; that
is, $e \sim MVN(0, \sigma^2 I)$. This is a commonly assumed
distribution, as it is justifiable for observational errors in
a wide range of processes and experiments.

Appendix  A  is  composed  of  a  simple  numerical  example  of 
the  unified  methodology  derived  in  the  preceding  chapters. 
An  experimental  design  application  of  the  general  linear  sta- 
tistical model  has  been  chosen  to  illustrate  the  use  of  the 
method  in  the  less-than-full  rank  case. 

Throughout  the  development,  it  will  be  noted  that  strong 
reliance  has  been  placed  upon  the  notation,  terminology,  and 
methods  of  proof  of  Graybill  (1).   This  is  due,  in  part,  to 
Graybill's  consistency  with  the  generalized  inverse  formula- 
tion.  Equally  important,  however,  is  the  simplicity  with 
which  Graybill  has  presented  the  basic  statistical  theory. 


II.   THE  GENERALIZED  INVERSE 
The mathematical concept known as the generalized inverse of
a matrix was first introduced by E. H. Moore in 1920.
Moore (2) presented the first published systematic
investigation of the properties of the generalized inverse in
1935. The concept was not widely recognized until 1955 when
R. Penrose (3) independently rediscovered it.

The Moore-Penrose generalized inverse of an arbitrary matrix A
has been defined by Penrose (3) as the solution, $G = A^{+}$,
of the following four equations.

$$AGA = A \qquad (2.1)$$

$$GAG = G \qquad (2.2)$$

$$(AG)' = AG \qquad (2.3)$$

$$(GA)' = GA \qquad (2.4)$$
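
As an illustrative check of the four defining equations (not part of the original development), the sketch below substitutes the design matrix X of Appendix A and the matrix reported there as $X^{+}$ into eqns. (2.1) through (2.4). Exact rational arithmetic is used so that equality can be tested without rounding; the helper names `transpose` and `matmul` are assumptions of this sketch.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    # Plain row-by-column product; exact for int and Fraction entries.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Design matrix X of the one-way layout in Appendix A (rank 3, four columns).
X = [[1, 1, 0, 0], [1, 1, 0, 0],
     [1, 0, 1, 0], [1, 0, 1, 0],
     [1, 0, 0, 1], [1, 0, 0, 1]]

# Candidate generalized inverse G = X^+ as reported in Appendix A.
G = [[Fraction(v, 8) for v in row] for row in
     [[1, 1, 1, 1, 1, 1],
      [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1],
      [-1, -1, -1, -1, 3, 3]]]

XG = matmul(X, G)
GX = matmul(G, X)

penrose = {
    "(2.1) AGA = A": matmul(XG, X) == X,
    "(2.2) GAG = G": matmul(GX, G) == G,
    "(2.3) (AG)' = AG": transpose(XG) == XG,
    "(2.4) (GA)' = GA": transpose(GX) == GX,
}
for name, holds in penrose.items():
    print(name, holds)
```

Because the solution of the four equations is unique, a matrix that passes all four checks is the generalized inverse of X, whatever method was used to obtain it.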

It is convenient to describe the generalized inverse of an
arbitrary matrix by two equations which are equivalent to the
defining equations. These equations can be formed by
substituting eqn. (2.3) into eqn. (2.2), and eqn. (2.4) into
eqn. (2.1). Respectively, these relationships are

$$GG'A' = G \qquad (2.5)$$

and

$$AA'G' = A \qquad (2.6)$$

There  are  several  properties  of  the  generalized  inverse 
which  will  be  useful  in  developing  the  statistical  aspects 
of  this  paper.   These  properties  are  stated  below  with  source 
references,  when  appropriate,  where  a  detailed  development 
of  the  property  may  be  found. 


Property (1).   The generalized inverse specified by the four
defining equations is unique. [Penrose (3)]

Property (2).   $(A^{+})^{+} = A$. [Penrose (3)]

Property (3).   $(A')^{+} = (A^{+})'$. [Penrose (3)]

Property (4).   $A^{+} = (A'A)^{+}A'$. [Penrose (3)]

Property (5).   $A^{+}A$, $AA^{+}$, $I - A^{+}A$, and
$I - AA^{+}$ are each idempotent. [Penrose (3)]

Property (6).   Rank $(A^{+})$ = rank $(A)$ = rank $(A^{+}A)$
= rank $(AA^{+})$ = trace $(A^{+}A)$. [Penrose (3)]

Property (7).   $(AB)^{+} = B^{+}A^{+}$ if, and only if,
$A^{+}A$ and $BB'$ commute, and $A'A$ and $BB^{+}$ commute.
[Greville (4)]

Property (8).   A necessary and sufficient condition that
$AXB = C$ has a solution for X is $AA^{+}CB^{+}B = C$. The
general solution is $X = A^{+}CB^{+} + Z - A^{+}AZBB^{+}$,
where Z is arbitrary. [Penrose (3)]

Property (9).   If A is an n x p matrix of rank p,
$(A'A)^{+} = (A'A)^{-1}$.

Proof:   (A'A) has an inverse, for it is a p x p matrix of
rank p. It may easily be verified that $(A'A)^{-1}$ satisfies
the defining equations for the generalized inverse of (A'A).
As the generalized inverse is unique,
$(A'A)^{+} = (A'A)^{-1}$.

Property (10).   If A is an n x p matrix of rank p,
$A^{+}A = I$.

Proof:   We know from matrix theory that if A is an n x p
matrix of rank p, then A has a left inverse, and
$(A'A)^{-1}A'$ is one such left inverse. It can be shown by
substitution into the defining equations that this left
inverse is $A^{+}$. The right inverse result for a p x n
matrix of rank p is, similarly, $AA^{+} = I$.

Property (11).   If a matrix A is of full rank and is
partitioned such that $A = (A_1 \; A_2)$, then

$$A^{+} = \begin{bmatrix} A_1^{+} - A_1^{+}A_2[(I - A_1A_1^{+})A_2]^{+} \\ [(I - A_1A_1^{+})A_2]^{+} \end{bmatrix} \qquad \text{[Cline (5)]}$$

In particular, it is noted that when $A_1'A_2 = 0$, then

$$A^{+} = \begin{bmatrix} A_1^{+} \\ A_2^{+} \end{bmatrix},$$

since $A_1^{+}A_2 = (A_1^{+}A_1^{+\prime}A_1')A_2 = 0$.
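
Properties (4), (5), and (6) can be spot-checked on the matrices of Appendix A. The sketch below is only a numerical illustration under that assumption: X, $X^{+}$, and $(X'X)^{+}$ are copied from the appendix, and the helper names are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
XtX_pinv = [[Fraction(v, 64) for v in row] for row in  # (X'X)^+ from Appendix A
            [[6, 2, 2, 2], [2, 22, -10, -10],
             [2, -10, 22, -10], [2, -10, -10, 22]]]

# Property (4): A^+ = (A'A)^+ A'.
prop4 = matmul(XtX_pinv, transpose(X)) == G

# Property (5): A^+A and I - A^+A are idempotent.
GX = matmul(G, X)
N = [[int(i == j) - GX[i][j] for j in range(4)] for i in range(4)]
prop5 = matmul(GX, GX) == GX and matmul(N, N) == N

# Property (6): trace(A^+A) equals rank(A), reported as r = 3 in Appendix A.
trace_GX = sum(GX[i][i] for i in range(4))

print(prop4, prop5, trace_GX)
```

The trace identity is what later allows the rank r to be read off from $X^{+}X$ without a separate rank computation.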

III.   ESTIMATION

Making the assumptions for the general linear statistical
model,

$$Y = X\beta + e,$$

where $E(e) = 0$ and $E(ee') = \sigma^2 I$, provides us with a
basis for estimation of the vector of unknown parameters and
functions of the parameters. Estimation of these quantities is
equivalent to estimation of the relationships which exist
between the observations, Y, and the matrix of known
constants, X.

As previously indicated, using the method of least squares to
minimize $(Y - X\beta)'(Y - X\beta)$ yields the normal
equations,

$$X'X\hat{\beta} = X'Y.$$

Regardless of the rank of X, properties (4) and (8) give us a
general solution for the least squares estimator of $\beta$,

$$\hat{\beta} = (X'X)^{+}X'Y + [I - (X'X)^{+}(X'X)]Z = X^{+}Y + (I - X^{+}X)Z,$$

where Z is an arbitrary p x 1 vector. It is apparent that if X
is of full rank, then by properties (9) and (10),

$$\hat{\beta} = X^{+}Y = (X'X)^{-1}X'Y,$$

which confirms the result derived in Chapter I. However, we
will not set the full rank case apart, for the general
solution holds for rank $(X) = r \le p$.
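
A short numerical sketch may help fix the role of the arbitrary vector Z. Using the data of Appendix A (an assumption of this example, as are the helper names and the particular Z chosen), two different choices of Z give two different solutions of the normal equations with identical fitted values.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
Y = [[1], [3], [2], [2], [8], [10]]                    # observations, Appendix A

Xt = transpose(X)
GX = matmul(G, X)
N = [[int(i == j) - GX[i][j] for j in range(4)] for i in range(4)]  # I - X^+X

def beta_hat(Z):
    """General solution X^+Y + (I - X^+X)Z for a p x 1 column Z."""
    fixed, free = matmul(G, Y), matmul(N, Z)
    return [[a[0] + b[0]] for a, b in zip(fixed, free)]

b0 = beta_hat([[0], [0], [0], [0]])
b1 = beta_hat([[5], [-2], [7], [1]])    # arbitrary illustrative Z

rhs = matmul(Xt, Y)
solves0 = matmul(matmul(Xt, X), b0) == rhs     # X'X b = X'Y
solves1 = matmul(matmul(Xt, X), b1) == rhs
same_fit = matmul(X, b0) == matmul(X, b1)      # X b does not depend on Z
print(b0, b1, solves0, solves1, same_fit)
```

The fitted values agree because $X(I - X^{+}X) = 0$ by eqn. (2.1), so the Z term never reaches $X\hat{\beta}$.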

Least squares estimation does not directly provide an estimate
of $\sigma^2$. However, by basing our estimate of $\sigma^2$
on $\hat{\beta}$, an unbiased estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{(Y - X\hat{\beta})'(Y - X\hat{\beta})}{n - r} = \frac{Y'(I - XX^{+})Y}{n - r}.$$
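
The two forms of the variance estimate can be compared directly. The sketch below assumes the data of Appendix A and obtains r from trace $(X^{+}X)$, as property (6) permits; the helper names are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
Y = [[1], [3], [2], [2], [8], [10]]

n = len(X)
GX = matmul(G, X)
r = sum(GX[i][i] for i in range(len(GX)))              # rank(X) = trace(X^+X)

# Form 1: residual sum of squares with beta_hat = X^+Y (Z = 0).
fitted = matmul(X, matmul(G, Y))
rss_residual = sum((y[0] - f[0]) ** 2 for y, f in zip(Y, fitted))

# Form 2: the quadratic form Y'(I - XX^+)Y.
XG = matmul(X, G)
M = [[int(i == j) - XG[i][j] for j in range(n)] for i in range(n)]
rss_quadratic = matmul(matmul(transpose(Y), M), Y)[0][0]

sigma2_hat = rss_quadratic / (n - r)
print(rss_residual, rss_quadratic, sigma2_hat)
```

Both forms give a residual sum of squares of 4, and dividing by n - r = 3 reproduces the value 4/3 reported in Appendix A.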

Of immediate interest is the mean vector and
variance-covariance matrix of the estimator $\hat{\beta}$.
Recalling that Y has mean vector $X\beta$ and
variance-covariance matrix $\sigma^2 I$,

$$E(\hat{\beta}) = E[X^{+}Y + (I - X^{+}X)Z] = X^{+}X\beta + [I - X^{+}X]Z$$

and

$$\mathrm{Var}(\hat{\beta}) = E[(X^{+}Y - X^{+}X\beta)(X^{+}Y - X^{+}X\beta)'] = E[X^{+}ee'X^{+\prime}] = \sigma^2 (X'X)^{+}.$$

It is noted that $\hat{\beta}$ is not, in general, an unbiased
estimator. However, if the vector $\beta$ happens to equal our
choice of the arbitrary vector Z, the estimator $\hat{\beta}$
is unbiased.

If we now specify the distribution of the vector of errors
such that $e \sim MVN(0, \sigma^2 I)$, it can be shown that
the method of maximum likelihood estimation leads to precisely
the same results as the least squares estimation of $\beta$.

By assuming that $E(e) = 0$ and $E(ee') = \sigma^2 V$, where V
is an arbitrary positive definite matrix, the estimation can
be somewhat generalized. The normal equations stemming from
least squares estimation in this more general case are,

$$(X'V^{-1}X)\hat{\beta} = X'V^{-1}Y,$$

with solutions,

$$\hat{\beta} = (X'V^{-1}X)^{+}X'V^{-1}Y + [I - (X'V^{-1}X)^{+}(X'V^{-1}X)]Z.$$

Furthermore,

$$E(\hat{\beta}) = (X'V^{-1}X)^{+}(X'V^{-1}X)\beta + [I - (X'V^{-1}X)^{+}(X'V^{-1}X)]Z$$

$$\mathrm{Var}(\hat{\beta}) = E[(\hat{\beta} - E(\hat{\beta}))(\hat{\beta} - E(\hat{\beta}))']$$

$$= E[(X'V^{-1}X)^{+}X'V^{-1}ee'V^{-1}X(X'V^{-1}X)^{+}]$$

$$= (X'V^{-1}X)^{+}(X'V^{-1}X)(X'V^{-1}X)^{+}\sigma^2 = \sigma^2 (X'V^{-1}X)^{+}.$$

In each of the distributional cases discussed above, the
estimator of $\beta$ is not unique if the rank (X) = r < p. In
the less-than-full rank case no direct inference can be made
concerning all the values of the elements of $\beta$. However,
there are invariant quantities in any solution which are
called estimable functions.

Consider a linear combination of the elements of $\beta$,
denoted by $f'\beta$. $f'\hat{\beta}$ is invariant with
respect to the arbitrary Z if, and only if,

$$f'[I - X^{+}X] = 0.$$

If X is of full rank, then this relationship holds for all
$f'$. In the less-than-full rank case, all linear combinations
of the form where $f' = b'X$ have this property since

$$f'[I - X^{+}X] = b'[X - XX^{+}X] = 0.$$

Furthermore, since the nullity of $[I - X^{+}X]$ is r, any
basis for the null space of $[I - X^{+}X]$ is a set of r
linearly independent solutions to $f'[I - X^{+}X] = 0$. Since
the rank of X is r and $b'X[I - X^{+}X] = 0$, all such vectors
$f'$ can be written $f' = b'X$. We may define a linearly
estimable function in terms of this invariance property; that
is, a linear function of the parameters, $f'\beta$, is
estimable if, and only if, $f'(I - X^{+}X) = 0$. Graybill (1)
defines a linearly estimable function as a function of the
unknown parameters for which there exists a vector $b'$ such
that

$$E(b'Y) = b'X\beta = f'\beta.$$

Clearly, the two definitions are equivalent. Graybill
furthermore proves that it is equivalent to say that a
linearly estimable function is a linear combination of the
parameters for which a solution for r exists to the equation

$$f' = r'X'X.$$

It is this form of the estimable function which will be most
useful to us.
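
The invariance condition $f'(I - X^{+}X) = 0$ gives a mechanical test for estimability. The sketch below applies it to a few linear combinations in the one-way layout of Appendix A; the function name `is_estimable` and the chosen combinations are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]

GX = matmul(G, X)
N = [[int(i == j) - GX[i][j] for j in range(4)] for i in range(4)]  # I - X^+X

def is_estimable(f):
    """True when f'(I - X^+X) = 0, with f given as a list (mu, tau1, tau2, tau3)."""
    return all(v == 0 for v in matmul([f], N)[0])

checks = {
    "tau1 - tau2": is_estimable([0, 1, -1, 0]),
    "mu + tau1": is_estimable([1, 1, 0, 0]),
    "tau1 alone": is_estimable([0, 1, 0, 0]),
    "mu alone": is_estimable([1, 0, 0, 0]),
}
for name, ok in checks.items():
    print(name, ok)
```

Treatment contrasts and cell means pass the test, while an individual effect or the overall mean taken alone does not, which matches the usual understanding of the less-than-full rank one-way model.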

Among other results which Graybill has developed with
applicability in the generalized inverse formulation are the
three listed below:

(1).   The best linear unbiased estimator of an estimable
function, $f'\beta$, is

$$\widehat{f'\beta} = f'\hat{\beta}.$$

(2).   The functions, $f_1'\beta, f_2'\beta, \ldots, f_m'\beta$
are called linearly independent estimable functions if each of
the functions is linearly estimable, and
$f_1', f_2', \ldots, f_m'$ are linearly independent vectors.

(3).   A set of linearly independent estimable functions
contains, at most, r functions, where r = rank (X).

A linearly estimable matrix function, $F'\beta$, is a set of m
linearly independent estimable functions, $1 \le m \le r$. A
matrix function may be represented as

$$F'\beta = \begin{bmatrix} f_1' \\ f_2' \\ \vdots \\ f_m' \end{bmatrix}\beta = \begin{bmatrix} r_1' \\ r_2' \\ \vdots \\ r_m' \end{bmatrix}X'X\beta = (R'X'X)\beta,$$

where the $f_i'\beta$ are linearly independent estimable
functions. A result which will be frequently utilized is that
the estimator of an estimable matrix function is invariant
with respect to the arbitrary matrix Z, which follows directly
from the fact that each component of $F'\hat{\beta}$ has this
property.

Considering the distributional case where
$e \sim MVN(0, \sigma^2 I)$, the estimator of an estimable
matrix function is always unique and, therefore, of use in
making inferences regarding the values of the estimable
functions. The mean vector and variance-covariance matrix of
our estimator are, respectively,

$$E(F'\hat{\beta}) = E(F'X^{+}Y) = R'X'(XX^{+}X)\beta = F'\beta$$

and

$$\mathrm{Var}(F'\hat{\beta}) = E[(F'X^{+}Y - F'\beta)(F'X^{+}Y - F'\beta)'] = E(F'X^{+}ee'X^{+\prime}F) = \sigma^2 F'X^{+}X^{+\prime}F = \sigma^2 F'R,$$

where the last step uses $F'X^{+} = R'X'XX^{+} = R'X'$, so that
$F'X^{+}X^{+\prime}F = R'X'XR = F'R$.
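
Since $E(ee') = \sigma^2 I$, the expectation $E(F'X^{+}ee'X^{+\prime}F)$ reduces to $\sigma^2 F'X^{+}X^{+\prime}F$. As a hedged numerical illustration, the sketch below evaluates $F'X^{+}X^{+\prime}F$ exactly for the matrices $F'$ and $R'$ of Appendix A and compares it with $F'R$; the variable names are assumptions of this example.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

h = Fraction(1, 2)
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
Ft = [[0, 1, -1, 0], [0, 1, 1, -2]]                    # F' from Appendix A
Rt = [[0, h, -h, 0], [0, h, h, -1]]                    # R' from Appendix A

FtG = matmul(Ft, G)                                    # F'X^+, which equals A = R'X'
cov_over_sigma2 = matmul(FtG, transpose(FtG))          # F'X^+ X^+' F
FtR = matmul(Ft, transpose(Rt))                        # F'R

print(FtG)
print(cov_over_sigma2, FtR)
```

In this example the variance of the first contrast, $\hat{\tau}_1 - \hat{\tau}_2$, comes out as $\sigma^2$, which agrees with the variance of a difference between two treatment means of two observations each.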

IV.   TESTS OF HYPOTHESES

We shall now discuss a test of the hypothesis $F'\beta = C$,
where $F'\beta$ is an m x 1 vector of linearly independent
estimable functions. $F'\beta = C$ is called a linearly
estimable hypothesis if $F'\beta$ is a set of linearly
independent estimable functions and C is a vector of known
constants. To illustrate this concept, suppose we desire to
test the equality of the first three components of $\beta$;
that is,

$$H_0: \beta_1 = \beta_2 = \beta_3.$$

If the functions,

$$f_1'\beta = (1 \;\; -1 \;\; 0 \;\; 0 \; \cdots \; 0)\beta$$

and

$$f_2'\beta = (1 \;\; 1 \;\; -2 \;\; 0 \; \cdots \; 0)\beta$$

are linearly estimable, then

$$F'\beta = (f_1 \; f_2)'\beta = \begin{bmatrix} \beta_1 - \beta_2 \\ \beta_1 + \beta_2 - 2\beta_3 \end{bmatrix} = 0$$

if, and only if, $\beta_1 = \beta_2 = \beta_3$. We require
that the linear estimable functions of the hypothesis be
independent. To have any of these not independent would be
redundant. For example, in the illustration above,

$$f_3'\beta = (0 \;\; 1 \;\; -1 \;\; 0 \; \cdots \; 0)\beta = 0$$

would test the equality of $\beta_2$ and $\beta_3$, which is
already being tested by $f_1'\beta$ and $f_2'\beta$.

In the full rank case, a consequence of the form of the
linearly estimable function and the estimable hypothesis is
that the elements of $\beta$, individually, and as a complete
set, are estimable. The hypothesis that $\beta = C$ is then
estimable. Furthermore, any subset of the elements of $\beta$
equal to a specified set of constants may be tested by
choosing the appropriate rows of the p x p identity matrix as
the matrix $F'$.

In developing the likelihood ratio test criterion, we will be
concerned only with the normal theory case where
$e \sim MVN(0, \sigma^2 I)$. The appropriate likelihood
function is then

$$f(e; \beta, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[\frac{-(Y - X\beta)'(Y - X\beta)}{2\sigma^2}\right].$$

To develop a test of the hypothesis $H_0: F'\beta = C$ versus
the alternative, $H_1: F'\beta \ne C$, we will utilize the
likelihood ratio,

$$\frac{L(\omega)}{L(\Omega)}.$$

$L(\Omega)$ is the maximum value of the likelihood function
with the parameters contained in the $(\beta, \sigma^2)$
p + 1 dimensional space, $\Omega$, which is unrestricted.
$L(\omega)$ is similarly the maximum value of the likelihood
function with the parameter space restricted by $H_0$. The
$\omega$ space is p - m + 1 dimensional because values for m
independent relationships among the elements of $\beta$ are
specified by the hypothesis. The maximizing values of the
parameters in the $\omega$ space shall be denoted by
$\tilde{\beta}$ and $\tilde{\sigma}^2$, while the
corresponding values in the $\Omega$ space shall be denoted by
$\hat{\beta}$ and $\hat{\sigma}^2$.

In the restricted parameter space we desire to maximize
$f(e; \beta, \sigma^2)$ with respect to $\sigma^2$ and $\beta$
subject to the restraint, $F'\beta = C$.

Let

$$\phi(e; \beta, \sigma^2) = \log_e [f(e; \beta, \sigma^2)].$$
It follows that

$$\frac{\partial\phi}{\partial\sigma^2} = \frac{(Y - X\beta)'(Y - X\beta)}{2\sigma^4} - \frac{n}{2\sigma^2} = 0.$$

The resulting maximizing value of $\sigma^2$ for any given
$\beta$ is

$$\tilde{\sigma}^2 = \frac{(Y - X\beta)'(Y - X\beta)}{n}.$$

Examining the likelihood function it is noted that the maximum
value of the likelihood function with respect to $\beta$ with
$\sigma^2 = \tilde{\sigma}^2 = (Y - X\beta)'(Y - X\beta)/n$
occurs when $(Y - X\beta)'(Y - X\beta)$ is a minimum. However,
we must constrain $\beta$ by the relationship, $F'\beta = C$.
The problem of finding the value of $\beta$ which maximizes
the likelihood function within the hypothesis constraint can
then be formulated as the quadratic program,

$$\min \; (Y - X\beta)'(Y - X\beta),$$

$$\text{subject to: } F'\beta = C.$$
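The derivatives of the Lagrangian of the objective function
are:

$$\frac{\partial L}{\partial\beta} = -2Y'X + 2\tilde{\beta}'X'X - 2\lambda'F' = 0 \qquad (4.1)$$

$$\frac{\partial L}{\partial\lambda} = F'\tilde{\beta} - C = 0, \qquad (4.2)$$

which are the necessary and sufficient conditions for a
minimization where $2\lambda'$ is the appropriate vector of
Lagrange multipliers. Here A denotes the m x n matrix $R'X'$,
so that $F' = R'X'X = AX$. We first premultiply the transpose
of eqn. (4.1) by $AX^{+\prime}$, yielding

$$-AY + F'\tilde{\beta} - AA'\lambda = 0.$$

Substituting C, from eqn. (4.2), for $F'\tilde{\beta}$, the
solution for $\lambda$ is

$$\lambda = (AA')^{+}C - A^{+\prime}Y.$$

Returning to eqn. (4.1) with this value of $\lambda$, and
solving for $\tilde{\beta}$, yields

$$\tilde{\beta} = (X'X)^{+}X'A'(A^{+\prime}A^{+}C - A^{+\prime}Y) + X^{+}X^{+\prime}X'Y + (I - X^{+}X)Q$$

$$= X^{+}A^{+}C - X^{+}A^{+}AY + X^{+}Y + (I - X^{+}X)Q, \qquad (4.3)$$

where Q is an arbitrary p x 1 vector.

Since

$$-2X'Y + 2X'X[X^{+}A^{+}C - X^{+}A^{+}AY + X^{+}Y + (I - X^{+}X)Q] - 2X'A'[(AA')^{+}C - A^{+\prime}Y] = 0,$$

$$-X'Y + X'A^{+}C - X'A^{+}AY + X'Y - X'A^{+}C + X'A'A^{+\prime}Y = 0,$$

clearly, eqn. (4.1) is satisfied by $\tilde{\beta}$ and
$\lambda$. Also

$$F'\tilde{\beta} = R'X'X[X^{+}A^{+}C - X^{+}A^{+}AY + X^{+}Y + (I - X^{+}X)Q]$$

$$= AA^{+}C - AA^{+}AY + AY.$$

Since A is an m x n matrix of rank m,

$$AA^{+} = I,$$

and eqn. (4.2) is also satisfied.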
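
The restricted solution can be checked numerically. The sketch below assumes the data of Appendix A with the hypothesis of equal treatment effects (C = 0) and sets the arbitrary vector Q to zero; it evaluates eqn. (4.3) and confirms that the constraint and the transposed form of eqn. (4.1) both hold. Helper and variable names are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

h, s = Fraction(1, 2), Fraction(1, 6)
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
A = [[h, h, -h, -h, 0, 0], [h, h, h, h, -1, -1]]            # A = R'X'
Apt = [[h, h, -h, -h, 0, 0], [s, s, s, s, -2 * s, -2 * s]]  # A^+' from Appendix A
Ap = transpose(Apt)                                         # A^+
Y = [[1], [3], [2], [2], [8], [10]]
C = [[0], [0]]                                              # H0: F'beta = 0

Xt, Ft = transpose(X), matmul(A, X)                         # F' = AX

# Eqn. (4.3) with Q = 0: X^+Y - X^+A^+(AY - C).
beta_tilde = sub(matmul(G, Y),
                 matmul(G, matmul(Ap, sub(matmul(A, Y), C))))
# Multipliers (AA')^+C - A^+'Y, using (AA')^+ = A^+'A^+.
lam = matmul(Apt, sub(matmul(Ap, C), Y))

constraint_ok = matmul(Ft, beta_tilde) == C                 # eqn. (4.2)
gradient = sub(sub(matmul(matmul(Xt, X), beta_tilde), matmul(Xt, Y)),
               matmul(transpose(Ft), lam))                  # transposed (4.1), halved
stationary_ok = all(g[0] == 0 for g in gradient)
print(beta_tilde, lam, constraint_ok, stationary_ok)
```

With Q = 0 the restricted estimate assigns the same fitted value, the grand mean 13/3, to every observation, as expected when all treatment effects are constrained to be equal.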

Returning to the likelihood function and substituting
$\tilde{\sigma}^2$, while retaining the general term,
$\tilde{\beta}$, we find that

$$L(\omega) = \max_{\substack{(\beta, \sigma^2) \in \omega \\ F'\beta = C}} f(e; \beta, \sigma^2) = \frac{n^{n/2} \exp[-n/2]}{(2\pi)^{n/2}[(Y - X\tilde{\beta})'(Y - X\tilde{\beta})]^{n/2}}.$$

Finding the maximizing values of $\sigma^2$ and $\beta$ in the
unrestricted parameter space yields the results,

$$\hat{\beta} = X^{+}Y + (I - X^{+}X)Z \qquad (4.4)$$

$$\hat{\sigma}^2 = \frac{(Y - X\hat{\beta})'(Y - X\hat{\beta})}{n},$$

where Z is an arbitrary p x 1 vector, and $\hat{\sigma}^2$ is
a biased estimator of $\sigma^2$. Combining these results in
the likelihood function yields the maximum value,

$$L(\Omega) = \max_{(\beta, \sigma^2) \in \Omega} f(e; \beta, \sigma^2) = \frac{n^{n/2} \exp[-n/2]}{(2\pi)^{n/2}[(Y - X\hat{\beta})'(Y - X\hat{\beta})]^{n/2}}.$$

The likelihood ratio is then

$$\frac{L(\omega)}{L(\Omega)} = \left[\frac{(Y - X\hat{\beta})'(Y - X\hat{\beta})}{(Y - X\tilde{\beta})'(Y - X\tilde{\beta})}\right]^{n/2},$$

where $\tilde{\beta}$ and $\hat{\beta}$ are as presented in
equations (4.3) and (4.4) respectively.

To  develop  our  test  criterion  we  will  first  manipulate 
the  quadratic  forms  of  the  likelihood  ratio  to  form  chi- 
square  random  variables.   The  independence  of  the  quadratic 
forms  will  then  be  demonstrated  and  the  test  criterion  estab- 
lished as  a  non-central  F  statistic. 

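For any $\tilde{\beta}$,

$$(Y - X\tilde{\beta})'(Y - X\tilde{\beta}) = (Y - X\hat{\beta})'(Y - X\hat{\beta}) + (\hat{\beta} - \tilde{\beta})'(X'X)(\hat{\beta} - \tilde{\beta}).$$

As a consequence, the likelihood ratio can be written as

$$\frac{L(\omega)}{L(\Omega)} = \left[\frac{1}{1 + \dfrac{(\hat{\beta} - \tilde{\beta})'X'X(\hat{\beta} - \tilde{\beta})}{(Y - X\hat{\beta})'(Y - X\hat{\beta})}}\right]^{n/2}. \qquad (4.5)$$

Let us now examine the ratio in the denominator. Using the
solutions for $\hat{\beta}$ and $\tilde{\beta}$, it follows
that

$$(\hat{\beta} - \tilde{\beta})'X'X(\hat{\beta} - \tilde{\beta})$$

$$= (XX^{+}Y - XX^{+}A^{+}C + XX^{+}A^{+}AY - XX^{+}Y)'(XX^{+}Y - XX^{+}A^{+}C + XX^{+}A^{+}AY - XX^{+}Y)$$

$$= (XX^{+}A^{+}AY - XX^{+}A^{+}C)'(XX^{+}A^{+}AY - XX^{+}A^{+}C)$$

$$= (AY - C)'(A^{+\prime}XX^{+}A^{+})(AY - C).$$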
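
The decomposition of the restricted residual sum of squares, and the reduction of the extra term to a quadratic form in AY - C, can both be confirmed on the data of Appendix A. The sketch below is an illustration under that assumption, with Z = Q = 0 and C = 0; the helper names `quad` and `rss` are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

h, s = Fraction(1, 2), Fraction(1, 6)
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
A = [[h, h, -h, -h, 0, 0], [h, h, h, h, -1, -1]]
Apt = [[h, h, -h, -h, 0, 0], [s, s, s, s, -2 * s, -2 * s]]  # A^+'
Ap = transpose(Apt)
Y = [[1], [3], [2], [2], [8], [10]]
C = [[0], [0]]

def quad(v, M):
    """Quadratic form v'Mv for a column vector v."""
    return matmul(matmul(transpose(v), M), v)[0][0]

def rss(b):
    """Residual sum of squares (Y - Xb)'(Y - Xb)."""
    return sum((y[0] - f[0]) ** 2 for y, f in zip(Y, matmul(X, b)))

u = sub(matmul(A, Y), C)                               # AY - C
beta_hat = matmul(G, Y)                                # eqn. (4.4), Z = 0
beta_tilde = sub(beta_hat, matmul(G, matmul(Ap, u)))   # eqn. (4.3), Q = 0

d = sub(beta_hat, beta_tilde)
q_beta = quad(d, matmul(transpose(X), X))              # (b^ - b~)'X'X(b^ - b~)
q_hyp = quad(u, matmul(Apt, matmul(matmul(X, G), Ap))) # (AY-C)'(A^+'XX^+A^+)(AY-C)
print(rss(beta_hat), rss(beta_tilde), q_beta, q_hyp)
```

Here the unrestricted and restricted residual sums of squares are 4 and 208/3, and their difference, 196/3, is the treatment sum of squares of the ordinary analysis of variance.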

By the definition of a non-central chi-square ($\chi'^2$)
random variable, if $Z \sim MVN(\mu, V)$, then
$Z'BZ \sim \chi'^2(q, \lambda)$, where q is the rank of BV and
$\lambda = \mu'B\mu/2$, if, and only if, BV is idempotent.
Then if $[(A^{+\prime}XX^{+}A^{+})/\sigma^2][\mathrm{Var}(AY - C)]$
is idempotent,
$(\hat{\beta} - \tilde{\beta})'X'X(\hat{\beta} - \tilde{\beta})/\sigma^2$
has a non-central chi-square distribution. Since
$Y \sim MVN(X\beta, \sigma^2 I)$,
$(AY - C) \sim MVN(AX\beta - C, \sigma^2 AA')$ and

$$BV = \frac{A^{+\prime}XX^{+}A^{+}}{\sigma^2}\,\sigma^2 AA'.$$

We must determine whether, or not, $A^{+\prime}XX^{+}A^{+}AA'$
is idempotent. It is clear that

$$A^{+\prime}XX^{+}(A^{+}AA') = A^{+\prime}XX^{+}(A') = A^{+\prime}XX^{+}(XR),$$

where $R'X'X = AX$. Therefore,

$$A^{+\prime}XX^{+}A^{+}AA' = A^{+\prime}XR = A^{+\prime}A'.$$

Since $A'$ is an n x m matrix of rank m, it has a left
inverse, $A^{+\prime}$, and

$$BV = A^{+\prime}XX^{+}A^{+}AA' = I.$$

Hence, BV is idempotent and of rank m. We can therefore
conclude that $(AY - C)'(A^{+\prime}XX^{+}A^{+})(AY - C)/\sigma^2$
has the non-central chi-square distribution with m degrees of
freedom and non-centrality parameter,
$\lambda = (F'\beta - C)'(A^{+\prime}XX^{+}A^{+})(F'\beta - C)/2\sigma^2$.

The quadratic form, $(Y - X\hat{\beta})'(Y - X\hat{\beta})$,
is equal to $Y'(I - XX^{+})Y$. Since $(I - XX^{+})$ is
idempotent of rank n - r,
$(Y - X\hat{\beta})'(Y - X\hat{\beta})/\sigma^2$ has the
central chi-square distribution with n - r degrees of freedom.
Consequently, our likelihood ratio is

$$\frac{L(\omega)}{L(\Omega)} = \left[\frac{1}{1 + \gamma/\zeta}\right]^{n/2}, \qquad (4.6)$$

where
$\gamma \sim \chi'^2(m, (F'\beta - C)'(A^{+\prime}XX^{+}A^{+})(F'\beta - C)/2\sigma^2)$
and the variable $\zeta \sim \chi^2(n - r)$.

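In general, two quadratic forms $T'DT$ and $T'ET$ are
independent if, and only if, $DVE = 0$, where
$T \sim MVN(\mu, V)$. The identities,

$$\gamma = (\hat{\beta} - \tilde{\beta})'X'X(\hat{\beta} - \tilde{\beta})/\sigma^2 = (Y - X\tilde{\beta})'XX^{+}(Y - X\tilde{\beta})/\sigma^2,$$

and

$$\zeta = (Y - X\hat{\beta})'(Y - X\hat{\beta})/\sigma^2 = (Y - X\tilde{\beta})'(I - XX^{+})(Y - X\tilde{\beta})/\sigma^2$$

are convenient forms for demonstrating the independence of
$\gamma$ and $\zeta$. Since

$$(Y - X\tilde{\beta}) = (I - XX^{+} + XX^{+}A^{+}A)Y - XX^{+}A^{+}C,$$

and the variance of Y is $\sigma^2 I$, then

$$\mathrm{Var}(Y - X\tilde{\beta}) = \sigma^2 (I - XX^{+} + XX^{+}A^{+}A)(I - XX^{+} + XX^{+}A^{+}A)' = \sigma^2 (I - XX^{+} + A^{+}A).$$

The quantity corresponding to DVE is then

$$\left[\frac{XX^{+}}{\sigma^2}\right]\sigma^2 (I - XX^{+} + A^{+}A)\left[\frac{I - XX^{+}}{\sigma^2}\right] = 0,$$

as can be easily verified.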
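
The verification of DVE = 0 can be carried out exactly for the matrices of Appendix A. The sketch below is one such check under that assumption: it forms $XX^{+}$, $A^{+}A$, and the product with the scalar factors of $\sigma^2$ omitted, since they do not affect whether the product vanishes. The variable names are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

h, s = Fraction(1, 2), Fraction(1, 6)
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
A = [[h, h, -h, -h, 0, 0], [h, h, h, h, -1, -1]]
Ap = transpose([[h, h, -h, -h, 0, 0], [s, s, s, s, -2 * s, -2 * s]])  # A^+

n = len(X)
P = matmul(X, G)                                       # XX^+
ApA = matmul(Ap, A)                                    # A^+A
E = [[int(i == j) - P[i][j] for j in range(n)] for i in range(n)]     # I - XX^+
V = [[E[i][j] + ApA[i][j] for j in range(n)] for i in range(n)]       # I - XX^+ + A^+A

DVE = matmul(matmul(P, V), E)
dve_is_zero = all(v == 0 for row in DVE for v in row)
v_idempotent = matmul(V, V) == V                       # supports the Var simplification
print(dve_is_zero, v_idempotent)
```

The check succeeds because the columns of $A^{+}$ lie in the column space of X, so that $XX^{+}A^{+}A = A^{+}A$ and $A^{+}A(I - XX^{+}) = 0$.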

The ratio of the non-central chi-square variable to the
central chi-square variable in the denominator of our
likelihood ratio with each variable divided by its degrees of
freedom has the non-central F distribution with m and n - r
degrees of freedom and non-centrality parameter,

$$\lambda = (F'\beta - C)'(A^{+\prime}XX^{+}A^{+})(F'\beta - C)/2\sigma^2.$$

A necessary and sufficient condition that $\lambda = 0$ is
that the hypothesis is true; that is, $F'\beta - C = 0$. In
this event, the ratio of the quadratic forms divided by their
degrees of freedom has a central F distribution,

$$\frac{\gamma\,(n - r)}{\zeta\,(m)} = \frac{n - r}{m} \cdot \frac{(AY - C)'(A^{+\prime}XX^{+}A^{+})(AY - C)}{(Y - X\hat{\beta})'(Y - X\hat{\beta})} \sim F_{m,\,n-r}.$$

Since rejection of the hypothesis is consistent with small
values of the likelihood ratio and the likelihood ratio is
monotonic decreasing in $[\gamma(n - r)]/[\zeta(m)]$, the
critical region for this test is

$$\frac{n - r}{m} \cdot \frac{\gamma}{\zeta} \ge F_{m,\,n-r}(\alpha),$$

where $F_{m,\,n-r}(\alpha)$ is the upper significance point
corresponding to significance level $\alpha$.
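
As a sketch of how the criterion might be evaluated in practice, the code below computes the statistic for the data and hypothesis of Appendix A and compares it with the tabulated point $F_{2,3}(.05) = 9.55$ quoted there. The critical value is taken from the appendix rather than computed, and the helper names are assumptions of this example.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def quad(v, M):
    """Quadratic form v'Mv for a column vector v."""
    return matmul(matmul(transpose(v), M), v)[0][0]

h, s = Fraction(1, 2), Fraction(1, 6)
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
     [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
G = [[Fraction(v, 8) for v in row] for row in          # X^+ from Appendix A
     [[1, 1, 1, 1, 1, 1], [3, 3, -1, -1, -1, -1],
      [-1, -1, 3, 3, -1, -1], [-1, -1, -1, -1, 3, 3]]]
A = [[h, h, -h, -h, 0, 0], [h, h, h, h, -1, -1]]
Apt = [[h, h, -h, -h, 0, 0], [s, s, s, s, -2 * s, -2 * s]]  # A^+'
Ap = transpose(Apt)
Y = [[1], [3], [2], [2], [8], [10]]
C = [[0], [0]]

n, m = len(X), len(A)
GX = matmul(G, X)
r = sum(GX[i][i] for i in range(len(GX)))              # rank(X) = trace(X^+X)

XG = matmul(X, G)
M = [[int(i == j) - XG[i][j] for j in range(n)] for i in range(n)]   # I - XX^+
u = [[a[0] - c[0]] for a, c in zip(matmul(A, Y), C)]                 # AY - C

numerator = quad(u, matmul(Apt, matmul(XG, Ap)))       # sigma^2 * gamma
denominator = quad(Y, M)                               # sigma^2 * zeta
F_stat = (n - r) / m * numerator / denominator

critical = Fraction("9.55")                            # F_{2,3}(.05), from Appendix A
reject = F_stat >= critical
print(float(F_stat), reject)
```

The unknown $\sigma^2$ cancels in the ratio, so the statistic depends only on the data, X, and the hypothesis matrices.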

As substantiation of our likelihood ratio criterion, we shall
compare it with commonly used criteria for two types of tests
in the full rank case. The common form of the full hypothesis
test criterion for $\beta = \beta^*$, Graybill (1) for
example, is

$$\frac{n - p}{p} \cdot \frac{(Y - X\beta^*)'(X(X'X)^{-1}X')(Y - X\beta^*)}{(Y - XX^{+}Y)'(Y - XX^{+}Y)} \ge F_{p,\,n-p}(\alpha).$$

It is apparent that the denominators are equal. Furthermore,

$$(Y - X\beta^*)'(X(X'X)^{-1}X')(Y - X\beta^*) = (Y - X\beta^*)'(XX^{+})'(XX^{+})(Y - X\beta^*)$$

$$= (X^{+}Y - \beta^*)'(X'X)(X^{+}Y - \beta^*).$$

Therefore the test criteria are equivalent in the full
hypothesis test.

To test that s of the elements of $\beta$ are equal to known
constants, while the other p - s elements are unspecified,
Graybill (1) uses the subhypothesis criterion,

$$\frac{n - p}{s} \cdot \frac{(Y - X_1\gamma_1^*)'(X(X'X)^{-1}X' - X_2(X_2'X_2)^{-1}X_2')(Y - X_1\gamma_1^*)}{(Y - X_1\gamma_1^*)'(I - X(X'X)^{-1}X')(Y - X_1\gamma_1^*)} \ge F_{s,\,n-p}(\alpha).$$

For the purpose of this full rank test, the elements of the
general linear model have been partitioned such that

$$X = (X_1, X_2), \qquad \beta' = (\gamma_1', \gamma_2')$$

and the hypothesis is $\gamma_1 = \gamma_1^*$, where
$\gamma_1^*$ is an s x 1 vector of known constants.

In the generalized inverse formulation this hypothesis may be
tested by setting $C = \gamma_1^*$ and choosing $F'$ of the
form $F' = [I \; 0]$, where the identity matrix is of
dimension s and the null matrix has dimensions s x (p - s). As
a simplification we shall consider only the full rank case
where $X_1$ is orthogonal to $X_2$; that is, $X_1'X_2 = 0$.
Since $F' = AX = [I \; 0]$, we must have

$$AX_1 = I$$

$$AX_2 = 0.$$

By property (11) of Chapter II, it is then clear that
$A = X_1^{+}$. Our hypothesis is then
$F'\beta = \gamma_1^*$, and the test criterion is

$$\frac{n - p}{s} \cdot \frac{(X_1^{+}Y - \gamma_1^*)'(X_1'XX^{+}X_1)(X_1^{+}Y - \gamma_1^*)}{(Y - X\hat{\beta})'(Y - X\hat{\beta})} \ge F_{s,\,n-p}(\alpha).$$

Again it is apparent that the denominators are equal. The
numerators are also equal since

$$(Y - X_1\gamma_1^*)'(X(X'X)^{-1}X' - X_2(X_2'X_2)^{-1}X_2')(Y - X_1\gamma_1^*)$$

$$= (Y - X_1\gamma_1^*)'(XX^{+} - X_2X_2^{+})(Y - X_1\gamma_1^*)$$

$$= (Y - X_1\gamma_1^*)'(X_1X_1^{+})(Y - X_1\gamma_1^*)$$

$$= (Y - X_1\gamma_1^*)'X_1^{+\prime}(X_1'XX^{+}X_1)X_1^{+}(Y - X_1\gamma_1^*)$$

$$= (X_1^{+}Y - \gamma_1^*)'(X_1'XX^{+}X_1)(X_1^{+}Y - \gamma_1^*).$$

The equivalence of our test criterion to the commonly used
criterion in this special case of the subhypothesis test is
then clear.

APPENDIX A
Estimation and Hypothesis Testing - An Illustration

Consider a completely randomized experimental design model,
where

$$Y_{ij} = \mu + \tau_i + e_{ij}, \qquad i = 1, 2, 3; \quad j = 1, 2.$$

We shall suppose that the required normality assumptions are
justifiable for our statistical model,

$$Y = X\beta + e,$$

where

$$e \sim MVN(0, \sigma^2 I).$$

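The necessary elements of our statistical model are:

Design Structure

$$\beta = \begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$

Observations

$$Y = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 2 \\ 8 \\ 10 \end{bmatrix}$$

The rank of X is r = 3. By solution of the equations defining
the generalized inverse,

$$X^{+} = \frac{1}{8}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 3 & 3 & -1 & -1 & -1 & -1 \\ -1 & -1 & 3 & 3 & -1 & -1 \\ -1 & -1 & -1 & -1 & 3 & 3 \end{bmatrix}$$

and

$$(X'X)^{+} = \frac{1}{64}\begin{bmatrix} 6 & 2 & 2 & 2 \\ 2 & 22 & -10 & -10 \\ 2 & -10 & 22 & -10 \\ 2 & -10 & -10 & 22 \end{bmatrix}.$$
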
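As this is an example of the less-than-full rank case, the
estimate is not unique. The estimator has the general
solution,

$$\hat{\beta} = X^{+}Y + [I - X^{+}X]Z = \frac{1}{8}\begin{bmatrix} 26 \\ -10 \\ -10 \\ 46 \end{bmatrix} + \frac{1}{8}\begin{bmatrix} 2 & -2 & -2 & -2 \\ -2 & 2 & 2 & 2 \\ -2 & 2 & 2 & 2 \\ -2 & 2 & 2 & 2 \end{bmatrix} Z.$$

The mean vector of $\hat{\beta}$ is

$$E(\hat{\beta}) = X^{+}X\beta + [I - X^{+}X]Z = \frac{1}{8}\begin{bmatrix} 6\mu + 2\tau_1 + 2\tau_2 + 2\tau_3 \\ 2\mu + 6\tau_1 - 2\tau_2 - 2\tau_3 \\ 2\mu - 2\tau_1 + 6\tau_2 - 2\tau_3 \\ 2\mu - 2\tau_1 - 2\tau_2 + 6\tau_3 \end{bmatrix} + \frac{1}{8}\begin{bmatrix} 2Z_1 - 2Z_2 - 2Z_3 - 2Z_4 \\ -2Z_1 + 2Z_2 + 2Z_3 + 2Z_4 \\ -2Z_1 + 2Z_2 + 2Z_3 + 2Z_4 \\ -2Z_1 + 2Z_2 + 2Z_3 + 2Z_4 \end{bmatrix}$$

and the variance-covariance matrix of $\hat{\beta}$ is

$$\mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{+} = \frac{\sigma^2}{32}\begin{bmatrix} 3 & 1 & 1 & 1 \\ 1 & 11 & -5 & -5 \\ 1 & -5 & 11 & -5 \\ 1 & -5 & -5 & 11 \end{bmatrix}.$$

Furthermore, the unbiased estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{(Y - X\hat{\beta})'(Y - X\hat{\beta})}{n - r} = 4/3.$$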
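
The estimator of $\beta$ in the less-than-full rank case by
itself does not tell us anything about the individual
components of $\beta$. However, consider the estimate of a
function of parameters, where

$$R' = \begin{bmatrix} 0 & 1/2 & -1/2 & 0 \\ 0 & 1/2 & 1/2 & -1 \end{bmatrix},$$

$$A = R'X' = \begin{bmatrix} 1/2 & 1/2 & -1/2 & -1/2 & 0 & 0 \\ 1/2 & 1/2 & 1/2 & 1/2 & -1 & -1 \end{bmatrix}$$

and

$$F' = AX = \begin{bmatrix} 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & -2 \end{bmatrix}.$$

The estimator of the estimable matrix function, $F'\beta$, is

$$F'\hat{\beta} = F'X^{+}Y = \begin{bmatrix} 0 \\ -14 \end{bmatrix},$$

with mean and variance-covariance matrix

$$E(F'\hat{\beta}) = F'\beta = \begin{bmatrix} \tau_1 - \tau_2 \\ \tau_1 + \tau_2 - 2\tau_3 \end{bmatrix}$$

$$\mathrm{Var}(F'\hat{\beta}) = \sigma^2 F'X^{+}X^{+\prime}F = \sigma^2 F'R = \sigma^2 \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}.$$

It is commonly desired to test the equality of all the
$\tau_i$, the treatment effects. In this event, the
corresponding estimable hypothesis is

$$F'\beta = \begin{bmatrix} 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & -2 \end{bmatrix}\beta = \begin{bmatrix} \tau_1 - \tau_2 \\ \tau_1 + \tau_2 - 2\tau_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$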

For this estimable hypothesis, the generalized inverse of
$A'$ is

$$A^{+\prime} = \begin{bmatrix} 1/2 & 1/2 & -1/2 & -1/2 & 0 & 0 \\ 1/6 & 1/6 & 1/6 & 1/6 & -1/3 & -1/3 \end{bmatrix}.$$

The test statistic,

$$\frac{n - r}{m} \cdot \frac{\gamma}{\zeta} = \frac{n - r}{m} \cdot \frac{(AY - C)'(A^{+\prime}XX^{+}A^{+})(AY - C)}{(Y - X\hat{\beta})'(Y - X\hat{\beta})} = 24.5.$$

The critical region for this test is

$$\frac{n - r}{m} \cdot \frac{\gamma}{\zeta} \ge F_{2,3}(.05) = 9.55.$$

We must therefore reject the hypothesis,
$H_0: \tau_1 = \tau_2 = \tau_3$.